Prediction of Classifier Training Time Including Parameter Optimization
Abstract
Besides classification performance, training time is a second important factor that affects the suitability of a classification algorithm for an unknown dataset. An algorithm with slightly lower accuracy may be preferred if its training time is significantly lower. In addition, an estimate of the training time required for a pattern recognition task is very useful if the result has to be available within a certain amount of time. Meta-learning is often used to predict the suitability or performance of classifiers using different learning schemes and features. Landmarking features in particular have been used very successfully in the past: the accuracy of simple learners is used to predict the performance of a more sophisticated algorithm. In this work, we investigate the quantitative prediction of the training time for several target classifiers. Different sets of meta-features are evaluated according to their suitability for predicting the actual run-times of a parameter optimization by grid search. Additionally, we adapted the concept of landmarking to time prediction: instead of their accuracy, the run-times of simple learners are used as feature values. We evaluated the approach on real-world datasets from the UCI machine learning repository and StatLib. The run-times of five different classification algorithms are predicted and evaluated using two different performance measures. The promising results show that the approach is able to reasonably predict the training time including a parameter optimization. Furthermore, different sets of meta-features seem to be necessary for different target algorithms in order to achieve the highest prediction performance.
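The idea of time landmarking described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The two landmarkers (a majority-class learner and a decision stump), the function names, and the choice of meta-features are all assumptions made for illustration, since the abstract does not name the learners or features actually used. The sketch records the training times of the simple learners as meta-features and measures the total run-time of a grid search, which would serve as the regression target.

```python
import time
from collections import Counter


def timed(fn, *args):
    """Return (result, elapsed wall-clock seconds) for one call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def fit_majority(X, y):
    """Hypothetical landmarker 1: always predict the most frequent class."""
    return Counter(y).most_common(1)[0][0]


def fit_stump(X, y):
    """Hypothetical landmarker 2: best single-feature threshold split."""
    best = (len(y) + 1, 0, 0.0)  # (training errors, feature index, threshold)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            # Each non-empty side predicts its majority class.
            errors = sum(
                len(side) - Counter(side).most_common(1)[0][1]
                for side in (left, right) if side
            )
            best = min(best, (errors, j, t))
    return best


def time_landmarks(X, y):
    """Meta-features: dataset size plus the landmarkers' training times."""
    features = {"n_instances": len(X), "n_attributes": len(X[0])}
    for name, fit in (("majority", fit_majority), ("stump", fit_stump)):
        _, features["time_" + name] = timed(fit, X, y)
    return features


def grid_search_time(fit_with_param, grid, X, y):
    """Regression target: total training time summed over all grid points."""
    return sum(timed(fit_with_param, p, X, y)[1] for p in grid)
```

Under these assumptions, one row of meta-data per dataset would pair the output of `time_landmarks` with the `grid_search_time` measured for a target classifier, and a regression model fitted on such rows would predict the run-time for a new dataset.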
Related resources
Support vector regression for prediction of gas reservoirs permeability
Reservoir permeability is a critical parameter for characterization of the hydrocarbon reservoirs. In fact, determination of permeability is a crucial task in reserve estimation, production and development. Traditional methods for permeability prediction are well log and core data analysis which are very expensive and time-consuming. Well log data is an alternative approach for prediction of pe...
Parameter Optimization of Kernel-Based One-Class Classifier on Imbalance Text Learning
Compared with conventional two-class learning schemes, one-class classification simply uses a single class in the classifier training phase. Applying one-class classification to learn from unbalanced data set is regarded as the recognition based learning and has shown to have the potential of achieving better performance. Similar to two-class learning, parameter selection is a significant issue,...
Call Classification with Hundreds of Classes and Hundred Thousands of Training Utterances ... ... and No Target Domain Data
This paper reports about an effort to build a large-scale call router able to reliably distinguish among 250 call reasons. Because training data from the specific application (Target) domain was not available, the statistical classifier was built using more than 300,000 transcribed and annotated utterances from related, but different, domains. Several tuning cycles including three re-annotation...
Simulation of groundwater quality parameters using ANN and ANN+PSO models (Case study: Ramhormoz Plain)
One of the main aims of water resource planners and managers is to estimate and predict the parameters of groundwater quality so that they can make managerial decisions. In this regard, many models have been developed that propose better management in order to maintain water quality. Most of these models require input parameters that are either hardly available or time-consuming and expensive to...